Abstract

In this paper we consider the space of those probability distributions which maximize the
q-Rényi entropy. These distributions have the same parameter space for every q, and in the
q = 1 case these are the normal distributions. Some methods to endow this parameter space with a
Riemannian metric are presented: the second derivative of the q-Rényi entropy, the Tsallis entropy and
the relative entropy give rise to a Riemannian metric, the Fisher information matrix is a natural
Riemannian metric, and there are some geometrically motivated metrics which were studied
by Siegel, Calvo and Oller, and Lovrić, Min-Oo and Ruh. These metrics are different, therefore
our differential geometrical calculations are based on a unified metric, which covers all the above
mentioned metrics among others. We also compute the geometrical properties of this metric:
the equation of the geodesic line with some special solutions, the Riemann and Ricci curvature
tensors and the scalar curvature. Using the correspondence between the volume of the geodesic ball
and the scalar curvature we show how the parameter q modulates the statistical distinguishability
of close points. We show that some frequently used metrics in quantum information geometry can
be easily recovered from classical metrics.

1 Introduction 

In theoretical statistics and in applications the distance functions between probability distributions
play an important role. The construction of a proper distance function has been considered by several
authors. But even the same statistical model with different mathematical frameworks can lead to
different distance functions. To narrow the family of potential distance functions we consider those
which are natural from a differential geometrical point of view.

Historically, the pioneering work of Mahalanobis [23] was generalized by Rao [30], who first
suggested the idea of considering the Fisher information [14] as a Riemannian metric on the space of
probability distributions. Čencov [8] was the first to study monotone metrics on statistical manifolds.
He proved that, up to a normalization, there exists a unique monotone metric, the Fisher information.
Amari [3] and Amari and Nagaoka provide a modern account of the general differential geometry that
arises from the Fisher information metric. The Fisher metric was studied further by Akin, James,
Burbea, Mitchell [22], Atkinson and Mitchell [5], Skovgaard [34], Oller [25], Oller and Cuadras
[27], and Oller and Corcuera [26], among other researchers. The combination of differential geometrical and
statistical studies helped to find the statistical interpretation of geometrical quantities. For example,
the geodesic distance between probability distributions, which is usually known as the Rao distance, is a
natural distance function between probability distributions; the statistical meaning of the so-called
e-curvature was first clarified by Efron [12]; the normalized volume measure of the manifold is called
Jeffreys' prior [17] within the field of Bayesian statistics.

In this paper we consider the space of those probability distributions which maximize the q-Rényi
entropy. These distributions have the same parameter space for every q, and in the q = 1 case these
are the normal distributions. The first results about the geometrical properties of these spaces are
due to Amari [3], [2]. He considered the Fisher information metric on these manifolds and computed
some geometrical invariants. Some methods to endow the parameter space with a Riemannian metric
are presented: the second derivative of the q-Rényi entropy [31], the Tsallis entropy [35] and the relative
entropy give rise to a Riemannian metric, the Fisher information matrix is a natural Riemannian
metric, and there are some geometrically motivated metrics which were studied by Siegel [33], Calvo
and Oller [7] and Lovrić, Min-Oo and Ruh [32]. These metrics are different, therefore our differential
geometrical calculations are based on a unified metric, which covers all the above mentioned metrics among
others. We also compute the geometrical properties of this metric: the equation of the geodesic line
with some special solutions, the Riemann and Ricci curvature tensors and the scalar curvature. Using the
correspondence between the volume of the geodesic ball and the scalar curvature we show how the
parameter q modulates the statistical distinguishability of close points. We show that some frequently
used metrics in quantum information geometry can be easily recovered from classical metrics.



2 q-Rényi entropy maximizing distributions

The normal distributions can be introduced as a result of the maximum entropy principle. Consider the
family of density functions which are continuous and supported on the real line with given expectation
value $\mu$ and variance $\sigma^2 \in \mathbb{R}$. Introducing the Lagrange multipliers $a, b, c$ we have the following
functional on the family of probability distributions

\[ S(p) = -\int p(x)\log p(x)\,dx - a\Big(\int p(x)\,dx - 1\Big) - b\Big(\int p(x)\,x\,dx - \mu\Big) - c\Big(\int p(x)(x-\mu)^2\,dx - \sigma^2\Big). \]

The variation of the functional is

\[ \delta S = \int \big(-\log p(x) - 1 - a - bx - c(x-\mu)^2\big)\,\delta p(x)\,dx. \]

The functional has an extremal point at p if its variation is zero. One can show that the entropy functional
has a local maximum at the point

\[ p(x) = \exp\big(-a - bx - c(x-\mu)^2\big) \]

for appropriate parameters $a, b, c \in \mathbb{R}$.

The family of one dimensional normal distributions $\mathcal{S}_1$ can be parameterized by the expectation
value $u \in \mathbb{R}$ and the parameter $d \in \mathbb{R}^+$ as

\[ f(d,u,x) = \sqrt{\frac{d}{2\pi}}\; e^{-\frac{d}{2}(x-u)^2}. \]

This means that $\mathcal{S}_1$ can be identified with a 2 dimensional space $\Xi_1 = \mathbb{R}^+ \times \mathbb{R}$. The statistical
properties of the distributions lead us to define a Riemannian metric on the space $\Xi_1$.

In general, the family of n dimensional normal distributions $\mathcal{S}_n$ can be parameterized by the
expectation vector $u \in \mathbb{R}^n$ and the inverse of the covariance matrix $D$. Let us denote the set of real
symmetric strictly positive $n \times n$ matrices by $\mathcal{M}_n$. Then we can identify the sets $\mathcal{S}_n$ and $\Xi_n = \mathcal{M}_n \times \mathbb{R}^n$
using the following one-to-one map

\[ \Xi_n \to \mathcal{S}_n, \qquad (D,u) \mapsto f(D,u,\cdot), \]

where

\[ f(D,u,x) = \frac{\sqrt{\det D}}{(2\pi)^{n/2}} \exp\Big(-\frac12 \langle x-u, D(x-u)\rangle\Big). \]

Normal distributions with zero expectation will be called special normal distributions. The
parameter space of the n dimensional special normal distributions is $\Xi_n^{(s)} = \mathcal{M}_n$.

One can generalize the above mentioned procedure to extend the notion of Gaussian distributions
using the q-Rényi entropy [31]. Let us fix a parameter $q \in \mathbb{R}^+ \setminus \{1\}$ and consider a density function
$p$. The q-Rényi entropy of the distribution $p$ is

\[ S_q(p) = \frac{1}{1-q} \log \int_{\mathbb{R}^n} p(x)^q \, dx \]

if the integral exists.
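As a small numerical illustration of this definition, the following sketch evaluates $S_q$ for a one dimensional normal density by a midpoint rule and compares it with the closed form $\frac12\log\frac{2\pi}{d} + \frac{\log q}{2(q-1)}$, which follows from a direct Gaussian integral. The function names, the truncation width and the step count are illustrative assumptions, not part of the paper.

```python
import math


def normal_pdf(x, u, d):
    """Density of the normal law with mean u and inverse variance d."""
    return math.sqrt(d / (2.0 * math.pi)) * math.exp(-0.5 * d * (x - u) ** 2)


def renyi_entropy_numeric(q, u, d, half_width=12.0, steps=20000):
    """S_q = log(integral of p^q) / (1 - q), midpoint rule on a truncated interval.

    The interval is u +/- half_width standard deviations (an assumed cut-off).
    """
    sigma = 1.0 / math.sqrt(d)
    lo, hi = u - half_width * sigma, u + half_width * sigma
    h = (hi - lo) / steps
    total = sum(normal_pdf(lo + (i + 0.5) * h, u, d) ** q for i in range(steps)) * h
    return math.log(total) / (1.0 - q)


def renyi_entropy_closed_form(q, d):
    """Closed form for the normal law; it tends to the Shannon entropy as q -> 1."""
    return 0.5 * math.log(2.0 * math.pi / d) + math.log(q) / (2.0 * (q - 1.0))
```

For values of q close to 1 the closed form approaches $\frac12\log\frac{2\pi e}{d}$, the usual entropy of the normal distribution.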

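The q-Rényi entropy maximizing distribution is the following. For a given $n \in \mathbb{N} \setminus \{0\}$ the parameter
space is $\Xi_n = \mathcal{M}_n \times \mathbb{R}^n$. For a parameter $(D,u) \in \Xi_n$ define the set

\[ \mathrm{Dom}(p,D,u) = \begin{cases} \mathbb{R}^n & \text{if } p \in \left]\frac{n}{n+2}, 1\right[, \\[4pt] \Big\{ x \in \mathbb{R}^n \;\Big|\; 1 + \dfrac{1-p}{2p - n(1-p)} \langle x-u, D(x-u)\rangle > 0 \Big\} & \text{if } p > 1, \end{cases} \]

and define the density function as

\[ f_p(D,u,x) = \begin{cases} A_{n,p} \sqrt{\det D}\, \Big( 1 + \dfrac{1-p}{2p - n(1-p)} \langle x-u, D(x-u)\rangle \Big)^{\frac{1}{p-1}} & \text{if } x \in \mathrm{Dom}(p,D,u); \\[4pt] 0 & \text{if } x \notin \mathrm{Dom}(p,D,u). \end{cases} \]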



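The normalization constant of the generalized p-Gaussian distribution is

\[ A_{n,p} = \begin{cases} \Big( \dfrac{1-p}{\pi(2p - n(1-p))} \Big)^{\frac n2} \dfrac{\Gamma\big(\frac{1}{1-p}\big)}{\Gamma\big(\frac{1}{1-p} - \frac n2\big)} & \text{if } p \in \left]\frac{n}{n+2}, 1\right[, \\[8pt] \Big( \dfrac{p-1}{\pi(2p - n(1-p))} \Big)^{\frac n2} \dfrac{\Gamma\big(\frac{p}{p-1} + \frac n2\big)}{\Gamma\big(\frac{p}{p-1}\big)} & \text{if } p > 1. \end{cases} \]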
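The definition can be checked numerically in the one dimensional case. The sketch below, written under the assumption that the density and the constant $A_{n,p}$ have the two-branch form $A_{n,p}\sqrt{\det D}\,(1 + \frac{1-p}{2p-n(1-p)}\langle x-u, D(x-u)\rangle)^{1/(p-1)}$ with Gamma-function ratios as normalization, integrates the density with a simple midpoint rule; it is meant to confirm unit mass and variance $1/d$, not to be an efficient implementation. All names are hypothetical.

```python
import math


def norm_const(n, p):
    """Normalization constant A_{n,p} of the p-Gaussian density (p != 1)."""
    c = 2.0 * p - n * (1.0 - p)  # positive exactly when p > n / (n + 2)
    if c <= 0.0 or p == 1.0:
        raise ValueError("need p > n/(n+2) and p != 1")
    if p < 1.0:
        nu = 1.0 / (1.0 - p)
        return ((1.0 - p) / (math.pi * c)) ** (n / 2.0) * math.gamma(nu) / math.gamma(nu - n / 2.0)
    mu = p / (p - 1.0)
    return ((p - 1.0) / (math.pi * c)) ** (n / 2.0) * math.gamma(mu + n / 2.0) / math.gamma(mu)


def density_1d(p, d, u, x):
    """One dimensional p-Gaussian density with inverse variance d and mean u."""
    c = 2.0 * p - (1.0 - p)
    base = 1.0 + (1.0 - p) / c * d * (x - u) ** 2
    if base <= 0.0:  # outside Dom(p, D, u); can only happen for p > 1
        return 0.0
    return norm_const(1, p) * math.sqrt(d) * base ** (1.0 / (p - 1.0))


def integrate(fn, lo, hi, steps=100000):
    """Plain midpoint rule; adequate for the smooth integrands used here."""
    h = (hi - lo) / steps
    return sum(fn(lo + (i + 0.5) * h) for i in range(steps)) * h
```

With this parametrization the covariance of the p-Gaussian equals the inverse of the parameter, so in one dimension the variance should come out as 1/d for both the heavy-tailed (p < 1) and the compactly supported (p > 1) branch.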



For given parameters $n \in \mathbb{N} \setminus \{0\}$ and $p > \frac{n}{n+2}$ we call

\[ M_p = \{ f_p(D,u,\cdot) \mid (D,u) \in \Xi_n \} \]

the family of p-generalized Gaussian distributions. This forms a manifold parameterized by $(D,u)$.
This is an $\alpha$-family of probability distributions, where $\alpha = 2p - 1$, and it is $\alpha$-flat (see Amari and Nagaoka
[1]). The present paper studies the geometrical structures of $M_p$.

If we consider the limit $q \to 1$ then the q-Rényi entropy tends to the entropy. From this point on
we will allow the p = 1 case, and we will consider it as the usual Gaussian distribution, and in the
p = 1 case we sometimes omit the index p. The set

\[ \mathcal{N} = \Big\{ (n,p) \in \mathbb{N} \times \mathbb{R} \;\Big|\; n > 0,\ p > \frac{n}{n+2} \Big\} \]

can be considered as the label set of the p-Gaussian distributions, and for every pair $(n,p) \in \mathcal{N}$ the
parameter space of the n-dimensional p-Gaussian distributions is $\Xi_n = \mathcal{M}_n \times \mathbb{R}^n$ and the parameter
space of the special Gaussian distributions is $\Xi_n^{(s)} = \mathcal{M}_n$.

We present a Theorem which shows the maximum q-Rényi entropy property of the p-Gaussian
distributions in the q = p case. The maximum Rényi entropy problem was solved by Moriguti in the
scalar case [24]. The distribution function was remarked by Zografos [37] in the multivariate case, but
not connected to the Rényi entropy. The problem was solved first by Kapur [19] in the multivariate
case; Johnson and Vignat also solved the problem in the multivariate case using the result of
Lutwak, Yang and Zhang [21]. Costa, Hero and Vignat [9] established properties of multivariate
distributions maximizing the Rényi entropy under a covariance constraint.

Theorem 2.1. For any probability density $g : \mathbb{R}^n \to \mathbb{R}^+$ with fixed covariance matrix $K$, expectation
$u \in \mathbb{R}^n$ and parameter $q > \frac{n}{n+2}$

\[ S_q(g) \le S_q\big(f_q(K^{-1}, u, \cdot)\big), \]

with equality if and only if $g = f_q(K^{-1}, u, \cdot)$ almost everywhere.

It is important to note that the p-Gaussian distributions maximize not only the q-Rényi entropy, but
the Tsallis entropy too, defined by Equation (9), and minimize the $\alpha$-relative entropy (defined in the next
Section) between the uniform distribution and an arbitrary one.

We call the family of probability distributions $f_p(D,u,\cdot)$ ($(n,p) \in \mathcal{N}$, $(D,u) \in \Xi_n$) extended
Gaussian distributions.



3 Riemannian metrics on the space of extended Gaussian 
distributions 

The parameter spaces $\Xi_n$ and $\Xi_n^{(s)}$ have a natural manifold structure. Let us denote the space of real
symmetric $n \times n$ matrices by $M_n$. Then at the point $(D,u) \in \Xi_n$ the tangent space $T_{(D,u)} \Xi_n$ can be
identified with $T_n = M_n \times \mathbb{R}^n$, since one can consider the tangent vector $(X,x)$ as a derivation defined
for any smooth function $h : \Xi_n \to \mathbb{R}$ as

\[ \frac{\partial h(D,u)}{\partial (X,x)} = \frac{d}{dt}\, h(D + tX, u + tx) \Big|_{t=0}. \tag{1} \]

In this setting a map

\[ g : \Xi_n \times T_n \times T_n \to \mathbb{R}, \qquad ((D,u),(X,x),(Y,y)) \mapsto g_{D,u}((X,x),(Y,y)) \]

will be called a Riemannian metric if the following conditions hold. For all $(D,u) \in \Xi_n$ the map

\[ g_{D,u} : T_n \times T_n \to \mathbb{R}, \qquad ((X,x),(Y,y)) \mapsto g_{D,u}((X,x),(Y,y)) \]

is a scalar product and for all $(X,x) \in T_n$ the map

\[ g((X,x),(X,x)) : \Xi_n \to \mathbb{R}, \qquad (D,u) \mapsto g_{D,u}((X,x),(X,x)) \]

is smooth.

Now we present some ideas how the space $\Xi_n$ can be endowed with a Riemannian metric. For
example the (q-Rényi) entropy can generate a Riemannian metric: the following Theorem
shows that the q-Rényi entropy is a convex functional of $D$, so its second derivative is a strictly positive
symmetric bilinear map, and therefore it can define a Riemannian metric.
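Theorem 3.1. For every pair $(n,p) \in \mathcal{N}$ and $(D,u) \in \Xi_n$ the q-Rényi entropy ($q \in \mathbb{R}^+$) of the
distribution $f_p(D,u,\cdot)$ is, if $p > 1$:

\[ S_q(f_p(D,u,\cdot)) = \frac n2 \log \frac{\pi(2p - n(1-p))}{p-1} + \frac{1}{1-q} \log \left[ \left( \frac{\Gamma\big(\frac{p}{p-1} + \frac n2\big)}{\Gamma\big(\frac{p}{p-1}\big)} \right)^{q} \frac{\Gamma\big(\frac{q}{p-1} + 1\big)}{\Gamma\big(\frac{q}{p-1} + 1 + \frac n2\big)} \right] - \frac12 \log\det D, \tag{2} \]

and if $p < 1$, $q > \frac{n(1-p)}{2}$:

\[ S_q(f_p(D,u,\cdot)) = \frac n2 \log \frac{\pi(2p - n(1-p))}{1-p} + \frac{1}{1-q} \log \left[ \left( \frac{\Gamma\big(\frac{1}{1-p}\big)}{\Gamma\big(\frac{1}{1-p} - \frac n2\big)} \right)^{q} \frac{\Gamma\big(\frac{q}{1-p} - \frac n2\big)}{\Gamma\big(\frac{q}{1-p}\big)} \right] - \frac12 \log\det D. \tag{3} \]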
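Proof. First we compute the integral

\[ I = \int_{\mathrm{Dom}(p,D,u)} f_p(D,u,x)^q \, dx. \tag{4} \]

Choose our new coordinate system in $\mathbb{R}^n$ parallel to the eigenvectors of $D$. In this coordinate
system $D$ is diagonal, with entries $(\lambda_i)_{i=1,\dots,n}$. If $p > 1$ then with the variables $a = \frac{p-1}{2p - n(1-p)}$
and $y_i = \sqrt{a\lambda_i}\,(x_i - u_i)$ the integral is

\[ I = \frac{A_{n,p}^q (\det D)^{\frac{q-1}{2}}}{a^{n/2}} \int_{B_n(1)} \Big( 1 - \sum_{k=1}^n y_k^2 \Big)^{\frac{q}{p-1}} dy, \]

where $B_n(1)$ is the unit ball in $\mathbb{R}^n$. In spherical coordinates this equation is

\[ I = \frac{A_{n,p}^q (\det D)^{\frac{q-1}{2}}}{a^{n/2}}\, F_n \int_0^1 (1 - r^2)^{\frac{q}{p-1}} r^{n-1} \, dr, \]

where $F_n$ is the surface of the n dimensional sphere with unit radius

\[ F_n = \frac{n \pi^{n/2}}{\Gamma\big(\frac n2 + 1\big)}. \]

Evaluating the integral

\[ \int_0^1 (1 - r^2)^{\frac{q}{p-1}} r^{n-1} \, dr = \frac{\Gamma\big(\frac{q}{p-1} + 1\big) \Gamma\big(\frac n2\big)}{2\, \Gamma\big(\frac{q}{p-1} + 1 + \frac n2\big)} \]

we have

\[ I = \Big( \frac{\pi}{a} \Big)^{\frac n2} A_{n,p}^q\, \frac{\Gamma\big(\frac{q}{p-1} + 1\big)}{\Gamma\big(\frac{q}{p-1} + 1 + \frac n2\big)}\, (\det D)^{\frac{q-1}{2}} \tag{5} \]

and this verifies Equation (2). If $p < 1$ then the integral (4) is

\[ I = \frac{A_{n,p}^q (\det D)^{\frac{q-1}{2}}}{a^{n/2}}\, F_n \int_0^\infty (1 + r^2)^{-\frac{q}{1-p}} r^{n-1} \, dr \]

after the substitutions $a = \frac{1-p}{2p - n(1-p)}$ and $y_i = \sqrt{a\lambda_i}\,(x_i - u_i)$. Evaluating the integral we get

\[ I = \Big( \frac{\pi}{a} \Big)^{\frac n2} A_{n,p}^q\, \frac{\Gamma\big(\frac{q}{1-p} - \frac n2\big)}{\Gamma\big(\frac{q}{1-p}\big)}\, (\det D)^{\frac{q-1}{2}} \tag{6} \]

which leads to Equation (3). □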

Since the q-Rényi entropy is independent of the expectation vector $u$, the entropy cannot generate
a Riemannian metric on the whole space $\Xi_n$, just on $\Xi_n^{(s)}$. The q-Rényi entropy can be written in the
form of

\[ S_q(f_p(D,u,\cdot)) = C_{n,p,q} - \frac12 \log\det D, \tag{7} \]

so the quadratic form generated by the functional $S_q$ on the space of p-Gaussian distributions, for every
point $D \in \Xi_n^{(s)}$ and tangent vectors $X, Y \in M_n$,

\[ g^{(R)}_D(X,Y) = \frac{\partial^2}{\partial s \partial t}\, S_q\big(f_p(D + tX + sY, 0, \cdot)\big) \Big|_{t=s=0} = -\frac12 \frac{\partial^2}{\partial s \partial t} \log\det(D + tX + sY) \Big|_{t=s=0}, \]

is independent of q and p.



Theorem 3.2. For every pair $(n,p) \in \mathcal{N}$, for every point $D \in \Xi_n^{(s)}$ and for every tangent vectors
$X, Y \in M_n$ we have

\[ g^{(R)}_D(X,Y) = \frac12 \operatorname{Tr}(D^{-1} X D^{-1} Y) \tag{8} \]

for the quadratic form generated by the q-Rényi entropy.
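The identity $g^{(R)}_D(X,Y) = \frac12\operatorname{Tr}(D^{-1}XD^{-1}Y)$ for the Hessian of $-\frac12\log\det D$ can be sanity-checked with finite differences. The following sketch does this for $2 \times 2$ symmetric matrices using hand-written helpers; the helper names and the step size are illustrative choices, and the check is only a numerical plausibility test, not a proof.

```python
import math


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    dt = det2(m)
    return [[m[1][1] / dt, -m[0][1] / dt], [-m[1][0] / dt, m[0][0] / dt]]


def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def trace2(m):
    return m[0][0] + m[1][1]


def potential(D):
    """The D-dependent part of the q-Renyi entropy: -1/2 log det D."""
    return -0.5 * math.log(det2(D))


def mixed_second_derivative(D, X, Y, h=1e-4):
    """Central finite-difference estimate of d^2/ds dt potential(D + tX + sY) at 0."""
    def shifted(t, s):
        return [[D[i][j] + t * X[i][j] + s * Y[i][j] for j in range(2)] for i in range(2)]
    return (potential(shifted(h, h)) - potential(shifted(h, -h))
            - potential(shifted(-h, h)) + potential(shifted(-h, -h))) / (4.0 * h * h)


def renyi_metric(D, X, Y):
    """Closed form 1/2 Tr(D^{-1} X D^{-1} Y)."""
    Dinv = inv2(D)
    return 0.5 * trace2(matmul2(matmul2(Dinv, X), matmul2(Dinv, Y)))
```

For a positive definite D and symmetric X, Y the two functions should agree up to the finite-difference error.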

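Proof. To compute the derivative of the function $-\frac12 \log\det D$ we use the following equalities for
symmetric strictly positive matrices

\[ \log\det A = \operatorname{Tr}\log A, \qquad \log A = \int_0^\infty (E + \tau E)^{-1} - (A + \tau E)^{-1} \, d\tau, \]

where $E$ denotes the identity matrix. Then the derivative is

\begin{align*}
-\frac12 \frac{\partial^2}{\partial s \partial t} \log\det(D + tX + sY) \Big|_{t=s=0}
&= -\frac12 \frac{\partial^2}{\partial s \partial t} \log\big[ (\det D) \det(E + tXD^{-1} + sYD^{-1}) \big] \Big|_{t=s=0} \\
&= -\frac12 \frac{\partial^2}{\partial s \partial t} \operatorname{Tr}\log(E + tXD^{-1} + sYD^{-1}) \Big|_{t=s=0} \\
&= -\frac12 \operatorname{Tr} \int_0^\infty \frac{\partial^2}{\partial s \partial t} \Big( (E + \tau E)^{-1} - (E + tXD^{-1} + sYD^{-1} + \tau E)^{-1} \Big) \Big|_{t=s=0} d\tau \\
&= \frac12 \operatorname{Tr} \int_0^\infty (E + \tau E)^{-1} \big( YD^{-1}(E + \tau E)^{-1} XD^{-1} + XD^{-1}(E + \tau E)^{-1} YD^{-1} \big) (E + \tau E)^{-1} \, d\tau \\
&= \operatorname{Tr}(XD^{-1}YD^{-1}) \int_0^\infty (1 + \tau)^{-3} \, d\tau.
\end{align*}

Since the last integral equals $\frac12$, this proves the equality $g^{(R)}_D(X,Y) = \frac12 \operatorname{Tr}(D^{-1}XD^{-1}Y)$.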

□

The Tsallis entropy [35] of the probability distribution $f$ is defined as

\[ S^{(q)}(f) = \frac{1}{1-q} \Big( \int f(x)^q \, dx - 1 \Big) \tag{9} \]

for parameter $q \in \mathbb{R}^+ \setminus \{1\}$. Let us denote the quadratic form generated by the Tsallis entropy by
$g^{(T,p,q)}$, i.e. for every point $D \in \Xi_n^{(s)}$ and tangent vectors $X, Y \in M_n$

\[ g^{(T,p,q)}_D(X,Y) = \frac{\partial^2}{\partial s \partial t}\, S^{(q)}\big(f_p(D + tX + sY, 0, \cdot)\big) \Big|_{t=s=0} \]

if $S^{(q)}(f_p(D + tX + sY, 0, \cdot))$ is well defined.

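Theorem 3.3. For every pair $(n,p) \in \mathcal{N}$, for every point $D \in \Xi_n^{(s)}$ and for every tangent vectors
$X, Y \in M_n$ we have the following expressions for the quadratic form generated by the entropy

\[ g^{(T,p,1)}_D(X,Y) = g^{(R)}_D(X,Y) = \frac12 \operatorname{Tr}(D^{-1}XD^{-1}Y). \tag{10} \]

If $S^{(q)}(f_p(D,u,\cdot))$ is well defined, then the generated quadratic form is

\[ g^{(T,p,q)}_D(X,Y) = A'_{n,p,q} \det(D)^{\frac{q-1}{2}} \Big( \frac12 \operatorname{Tr}(D^{-1}XD^{-1}Y) - \frac{q-1}{4} \operatorname{Tr}(XD^{-1}) \operatorname{Tr}(YD^{-1}) \Big), \tag{11} \]

where

\[ A'_{n,p,q} = \begin{cases} \Big( \dfrac{\pi(2p - n(1-p))}{p-1} \Big)^{\frac n2} A_{n,p}^q\, \dfrac{\Gamma\big(\frac{q}{p-1} + 1\big)}{\Gamma\big(\frac{q}{p-1} + 1 + \frac n2\big)} & \text{if } p > 1, \\[8pt] \Big( \dfrac{\pi(2p - n(1-p))}{1-p} \Big)^{\frac n2} A_{n,p}^q\, \dfrac{\Gamma\big(\frac{q}{1-p} - \frac n2\big)}{\Gamma\big(\frac{q}{1-p}\big)} & \text{if } p < 1. \end{cases} \]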

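Proof. Since we have the limit

\[ \lim_{q \to 1} S^{(q)}(f_p(D,u,\cdot)) = \lim_{q \to 1} S_q(f_p(D,u,\cdot)) = S(f_p(D,u,\cdot)) \]

the formula $g^{(T,p,1)}_D(X,Y) = g^{(R)}_D(X,Y)$ is straightforward from Equation (7), and the metric
was computed in the previous Theorem.

Now let us compute the derivative of the determinant function.

\begin{align*}
\frac{d}{dt} \det(D + tX) \Big|_{t=0}
&= \det D\, \frac{d}{dt} \det(E + tXD^{-1}) \Big|_{t=0}
 = \det D\, \frac{d}{dt} \exp \operatorname{Tr} \log(E + tXD^{-1}) \Big|_{t=0} \\
&= \det D \operatorname{Tr} \frac{d}{dt} \int_0^\infty (E + \tau E)^{-1} - (E + tXD^{-1} + \tau E)^{-1} \, d\tau \Big|_{t=0} \\
&= \det D \operatorname{Tr} \int_0^\infty (E + \tau E)^{-1} XD^{-1} (E + \tau E)^{-1} \, d\tau = (\det D)(\operatorname{Tr} XD^{-1}).
\end{align*}

This can be expressed as $(d\det)(D)(X) = (\det D)(\operatorname{Tr} XD^{-1})$. The second derivative of the
determinant is

\[ (d^2 \det)(D)(X)(Y) = (\det D)(\operatorname{Tr} XD^{-1})(\operatorname{Tr} YD^{-1}) - (\det D)(\operatorname{Tr} XD^{-1}YD^{-1}). \]

According to these equalities the derivative of the function $(\det D)^{\frac{q-1}{2}}$ is

\begin{align*}
\big(d^2 \det{}^{\frac{q-1}{2}}\big)(D)(X)(Y)
&= \frac{\partial^2}{\partial s \partial t} \big( \det(D + tX + sY) \big)^{\frac{q-1}{2}} \Big|_{t=s=0} \\
&= \frac{q-1}{2} (\det D)^{\frac{q-1}{2}} \Big( -\operatorname{Tr} XD^{-1}YD^{-1} + \frac{q-1}{2} \operatorname{Tr} XD^{-1} \operatorname{Tr} YD^{-1} \Big).
\end{align*}

This gives Equation (11), and the constants $A'_{n,p,q}$ come from Equations (5) and (6).

□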
The Fisher information matrix is defined on parametric probability distributions. At a point
$(D,u) \in \Xi_n$ the quadratic form

\[ g^{(F,q)}_{D,u}((X,x),(Y,y)) = \int_{\mathbb{R}^n} f_q(D,u,z)\, \frac{\partial \log f_q(D,u,z)}{\partial (X,x)}\, \frac{\partial \log f_q(D,u,z)}{\partial (Y,y)} \, dz \]

(if the integral exists) gives rise to a positive definite matrix, so it induces a Riemannian metric on
$\Xi_n$. This metric is often called the expected information metric for the family of probability density
functions; the original ideas are due to Fisher [14] and Rao [30].

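Theorem 3.4. For every pair $(n,p) \in \mathcal{N}$, where $p < 2$, for every point $(D,u) \in \Xi_n$ and for every
tangent vectors $(X,x), (Y,y) \in T_n$ the Fisher information matrix of $M_p$ is

\[ g^{(F,p)}_{D,u}((X,x),(Y,y)) = \frac{1}{2(2-p)} \operatorname{Tr}(D^{-1}XD^{-1}Y) + \frac{p-1}{4(2-p)} \operatorname{Tr}(D^{-1}X) \operatorname{Tr}(D^{-1}Y) + \frac{2 + n(p-1)}{(2p + n(p-1))(2-p)} \langle x, Dy \rangle. \]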
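For n = 1 the Fisher information of the p-Gaussian family reduces to three numbers, which makes a direct numerical check feasible. The sketch below assumes the closed forms $g_{uu} = \frac{(1+p)\,d}{(3p-1)(2-p)}$, $g_{dd} = \frac{1+p}{4(2-p)d^2}$ and $g_{ud} = 0$ (the n = 1 specialization of the three coefficients $\frac{1}{2(2-p)}$, $\frac{p-1}{4(2-p)}$ and $\frac{2+n(p-1)}{(2p+n(p-1))(2-p)}$) and compares them with midpoint-rule integrals of the score products. Names and quadrature settings are hypothetical; the simple rule is only adequate for values of p such as 0.8 or 1.5, where the integrands stay bounded.

```python
import math


def norm_const_1d(p):
    """A_{1,p}; requires 1/3 < p < 2 and p != 1."""
    c = 3.0 * p - 1.0
    if p < 1.0:
        nu = 1.0 / (1.0 - p)
        return math.sqrt((1.0 - p) / (math.pi * c)) * math.gamma(nu) / math.gamma(nu - 0.5)
    mu = p / (p - 1.0)
    return math.sqrt((p - 1.0) / (math.pi * c)) * math.gamma(mu + 0.5) / math.gamma(mu)


def fisher_numeric(p, d, u, steps=100000):
    """Return (g_uu, g_dd, g_ud) from integrals of density * score * score."""
    c = 3.0 * p - 1.0
    k = (1.0 - p) / c
    A = norm_const_1d(p)
    if p > 1.0:  # compact support: integrate exactly over Dom(p, D, u)
        r = math.sqrt(c / ((p - 1.0) * d))
    else:        # heavy tails: assumed truncation at 60 standard deviations
        r = 60.0 / math.sqrt(d)
    lo, hi = u - r, u + r
    h = (hi - lo) / steps
    g_uu = g_dd = g_ud = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        base = 1.0 + k * d * (x - u) ** 2
        if base <= 0.0:
            continue
        dens = A * math.sqrt(d) * base ** (1.0 / (p - 1.0))
        score_u = 2.0 * d * (x - u) / (c * base)                       # d log f / du
        score_d = 0.5 / d + k * (x - u) ** 2 / ((p - 1.0) * base)      # d log f / dd
        g_uu += dens * score_u * score_u
        g_dd += dens * score_d * score_d
        g_ud += dens * score_u * score_d
    return g_uu * h, g_dd * h, g_ud * h


def fisher_closed_form(p, d):
    """Assumed n = 1 closed forms (g_uu, g_dd, g_ud)."""
    return ((1.0 + p) * d / ((3.0 * p - 1.0) * (2.0 - p)),
            (1.0 + p) / (4.0 * (2.0 - p) * d * d),
            0.0)
```

At p = 1 the closed forms reduce to d and $1/(2d^2)$, the familiar Fisher information of the normal family in the parameters (u, d).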

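Proof. At a point $(D,u) \in \Xi_n$ we have

\[ \log f_p(D,u,x) = \log A_{n,p} + \frac12 \log\det D + \frac{1}{p-1} \log\Big( 1 + \frac{1-p}{2p - n(1-p)} \langle x-u, D(x-u) \rangle \Big). \tag{12} \]

Choose our new coordinate system in $\mathbb{R}^n$ parallel to the eigenvectors of $D$. In this coordinate system
$D$ is diagonal, with entries $(\lambda_i)_{i=1,\dots,n}$. Let us denote by $(e_k)_{k=1,\dots,n}$ the orthonormal basis. According
to Equation (1) the partial derivative of Equation (12) with respect to a basis vector is

\[ \frac{\partial \log f_p(D,u,x)}{\partial (0,e_k)} = \frac{2}{2p - n(1-p)}\, \frac{\lambda_k (x_k - u_k)}{1 + a \langle x-u, D(x-u) \rangle}, \qquad a = \frac{1-p}{2p - n(1-p)}. \tag{13} \]

First we consider the $p \in \left]\frac{n}{n+2}, 1\right[$ case. The Fisher information is

\[ g^{(F,p)}_{D,u}((0,e_k),(0,e_l)) = \frac{4 \lambda_k \lambda_l A_{n,p} \sqrt{\det D}}{(2p - n(1-p))^2} \int_{\mathbb{R}^n} \big( 1 + a \langle x-u, D(x-u) \rangle \big)^{\frac{1}{p-1} - 2} (x_k - u_k)(x_l - u_l) \, dx. \]

Introducing the new variables $y_i = \sqrt{a\lambda_i}\,(x_i - u_i)$ we have

\[ g^{(F,p)}_{D,u}((0,e_k),(0,e_l)) = \frac{4 \sqrt{\lambda_k \lambda_l}\, A_{n,p}}{(2p - n(1-p))^2\, a^{\frac n2 + 1}} \int_{\mathbb{R}^n} \Big( 1 + \sum_{i=1}^n y_i^2 \Big)^{\frac{1}{p-1} - 2} y_k y_l \, dy, \]

which vanishes for $k \ne l$ because the integrand is odd in $y_k$. For $k = l$, if $n = 1$ the integral is

\[ \int_{-\infty}^{\infty} (1 + y^2)^{\frac{1}{p-1} - 2}\, y^2 \, dy, \]

and if $n > 1$ then in the spherical coordinates in $n - 1$ dimension we have

\[ \int_{-\infty}^{\infty} \int_0^{\infty} (1 + y_k^2 + r^2)^{\frac{1}{p-1} - 2}\, y_k^2\, r^{n-2} F_{n-1} \, dr \, dy_k. \]

Using the integral formulas

\begin{align*}
\int_{-\infty}^{\infty} \int_0^{\infty} (1 + y_k^2 + r^2)^{\frac{1}{p-1} - 2}\, y_k^2\, r^{n-2} \, dr \, dy_k
&= \frac{\sqrt{\pi}\, \Gamma\big(\frac{1}{1-p} + \frac12\big)}{2\, \Gamma\big(\frac{1}{1-p} + 2\big)} \int_0^\infty (1 + r^2)^{-\frac{1}{1-p} - \frac12}\, r^{n-2} \, dr \\
&= \frac{\sqrt{\pi}\, \Gamma\big(\frac{n-1}{2}\big) \Gamma\big(\frac{1}{1-p} + 1 - \frac n2\big)}{4\, \Gamma\big(\frac{1}{1-p} + 2\big)}
\end{align*}

and after some simplification we have

\[ g^{(F,p)}_{D,u}((0,e_k),(0,e_l)) = \delta_{kl}\, \frac{2 + n(p-1)}{(2p + n(p-1))(2-p)}\, \lambda_k \tag{14} \]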

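which is valid for every $n \in \mathbb{N} \setminus \{0\}$. Since $D$ is diagonal the partial derivative of Equation (12)
with respect to the matrix unit $E_{ii}$ is

\[ \frac{\partial \log f_p(D,u,x)}{\partial (E_{ii},0)} = \frac{1}{2\lambda_i} + \frac{a}{p-1}\, \frac{(x_i - u_i)^2}{1 + a \langle x-u, D(x-u) \rangle}. \tag{15} \]

The Fisher information is

\begin{align*}
g^{(F,p)}_{D,u}((E_{ii},0),(E_{kk},0)) = \frac{1}{4\lambda_i\lambda_k}
&+ \frac{a A_{n,p} \sqrt{\det D}}{2(p-1)\lambda_k} \int_{\mathbb{R}^n} \big( 1 + a \langle x-u, D(x-u) \rangle \big)^{\frac{1}{p-1} - 1} (x_i - u_i)^2 \, dx \\
&+ \frac{a A_{n,p} \sqrt{\det D}}{2(p-1)\lambda_i} \int_{\mathbb{R}^n} \big( 1 + a \langle x-u, D(x-u) \rangle \big)^{\frac{1}{p-1} - 1} (x_k - u_k)^2 \, dx \\
&+ \frac{a^2 A_{n,p} \sqrt{\det D}}{(p-1)^2} \int_{\mathbb{R}^n} \big( 1 + a \langle x-u, D(x-u) \rangle \big)^{\frac{1}{p-1} - 2} (x_i - u_i)^2 (x_k - u_k)^2 \, dx.
\end{align*}

The first integral is

\[ \frac{A_{n,p}}{2(p-1)\, a^{\frac n2} \lambda_i \lambda_k} \int_{-\infty}^{\infty} \int_0^{\infty} (1 + y_i^2 + r^2)^{\frac{1}{p-1} - 1}\, y_i^2\, r^{n-2} F_{n-1} \, dr \, dy_i = -\frac{1}{4\lambda_i\lambda_k}, \]

and the second one has the same value. The third one is, if $i \ne k$,

\[ \frac{A_{n,p}}{(p-1)^2 a^{\frac n2} \lambda_i \lambda_k} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_0^{\infty} (1 + y_i^2 + y_k^2 + r^2)^{\frac{1}{p-1} - 2}\, y_i^2 y_k^2\, r^{n-3} F_{n-2} \, dr \, dy_i \, dy_k = \frac{1}{4(2-p)\lambda_i\lambda_k}, \]

and if $i = k$

\[ \frac{A_{n,p}}{(p-1)^2 a^{\frac n2} \lambda_i^2} \int_{-\infty}^{\infty} \int_0^{\infty} (1 + y_i^2 + r^2)^{\frac{1}{p-1} - 2}\, y_i^4\, r^{n-2} F_{n-1} \, dr \, dy_i = \frac{3}{4(2-p)\lambda_i^2}. \]

Combining these integrals

\[ g^{(F,p)}_{D,u}((E_{ii},0),(E_{kk},0)) = \frac{1}{\lambda_i\lambda_k} \Big( \frac{\delta_{ik}}{2(2-p)} + \frac{p-1}{4(2-p)} \Big). \]

If $D$ is not diagonal this can be expressed as

\[ g^{(F,p)}_{D,u}((X,0),(Y,0)) = \frac{1}{2(2-p)} \operatorname{Tr}(D^{-1}XD^{-1}Y) + \frac{p-1}{4(2-p)} \operatorname{Tr}(D^{-1}X) \operatorname{Tr}(D^{-1}Y). \tag{16} \]

Finally the formulas (14), (16) give us the metric, since

\[ g^{(F,p)}_{D,u}((E_{ii},0),(0,e_k)) = 0. \tag{17} \]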

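If $p > 1$ then the partial derivatives given by the Equations (13), (15) are the same. The Fisher
information is

\[ g^{(F,p)}_{D,u}((0,e_k),(0,e_l)) = \frac{4 \lambda_k \lambda_l A_{n,p} \sqrt{\det D}}{(2p - n(1-p))^2} \int_{\mathrm{Dom}(p,D,u)} \big( 1 - a \langle x-u, D(x-u) \rangle \big)^{\frac{1}{p-1} - 2} (x_k - u_k)(x_l - u_l) \, dx, \]

where now $a = \frac{p-1}{2p - n(1-p)}$. Introducing the new variables $y_i = \sqrt{a\lambda_i}\,(x_i - u_i)$ we have

\[ g^{(F,p)}_{D,u}((0,e_k),(0,e_l)) = \frac{4 \sqrt{\lambda_k \lambda_l}\, A_{n,p}}{(2p - n(1-p))^2\, a^{\frac n2 + 1}} \int_{B_n(1)} \Big( 1 - \sum_{i=1}^n y_i^2 \Big)^{\frac{1}{p-1} - 2} y_k y_l \, dy, \]

where $B_n(1)$ is the closed unit ball in $\mathbb{R}^n$ with center origin. For $k = l$, if $n = 1$ the integral is

\[ \int_{-1}^{1} (1 - y^2)^{\frac{1}{p-1} - 2}\, y^2 \, dy, \]

and if $n > 1$ then in the spherical coordinates in $n - 1$ dimension we have

\[ \iint_{y_k^2 + r^2 \le 1} (1 - y_k^2 - r^2)^{\frac{1}{p-1} - 2}\, y_k^2\, r^{n-2} F_{n-1} \, dy_k \, dr. \]

Evaluating the integral

\[ \iint_{y_k^2 + r^2 \le 1} (1 - y_k^2 - r^2)^{\frac{1}{p-1} - 2}\, y_k^2\, r^{n-2} \, dy_k \, dr = \frac{\sqrt{\pi}\, \Gamma\big(\frac{n-1}{2}\big) \Gamma\big(\frac{1}{p-1} - 1\big)}{4\, \Gamma\big(\frac{1}{p-1} + \frac n2\big)} \]

we get again Equation (14) for every n. The Riemannian product of the matrix units is

\begin{align*}
g^{(F,p)}_{D,u}((E_{ii},0),(E_{kk},0)) = \frac{1}{4\lambda_i\lambda_k}
&- \frac{a A_{n,p} \sqrt{\det D}}{2(p-1)\lambda_k} \int_{\mathrm{Dom}(p,D,u)} \big( 1 - a \langle x-u, D(x-u) \rangle \big)^{\frac{1}{p-1} - 1} (x_i - u_i)^2 \, dx \\
&- \frac{a A_{n,p} \sqrt{\det D}}{2(p-1)\lambda_i} \int_{\mathrm{Dom}(p,D,u)} \big( 1 - a \langle x-u, D(x-u) \rangle \big)^{\frac{1}{p-1} - 1} (x_k - u_k)^2 \, dx \\
&+ \frac{a^2 A_{n,p} \sqrt{\det D}}{(p-1)^2} \int_{\mathrm{Dom}(p,D,u)} \big( 1 - a \langle x-u, D(x-u) \rangle \big)^{\frac{1}{p-1} - 2} (x_i - u_i)^2 (x_k - u_k)^2 \, dx.
\end{align*}

Introducing the variables $y_i = \sqrt{a\lambda_i}\,(x_i - u_i)$ the domain $\mathrm{Dom}(p,D,u)$ will be transformed to $B_n(1)$.
Evaluating the integrals we get Equation (16). Finally we note that Equation (17) is valid in this
$p > 1$ setting too, and this completes the proof. □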






In the $p \ge 2$ case the Fisher information does not exist, since the integral which defines it diverges.

Relative entropies are special distance functions between probability measures, and although there
are several relative entropy functions, most of them are special Csiszár $\varphi$-divergences [10], [11].
Assume that $\varphi : \mathbb{R}^+ \to \mathbb{R}$ is a strictly convex function, and $\varphi(1) = 0$. Then one can define the Csiszár
$\varphi$-relative entropy as

\[ H_\varphi\big(f_p(D_1,u_1,\cdot), f_p(D_2,u_2,\cdot)\big) = \int f_p(D_1,u_1,x)\, \varphi\Big( \frac{f_p(D_2,u_2,x)}{f_p(D_1,u_1,x)} \Big) \, dx. \]

For example the Kullback-Leibler [20], Hellinger [15] and $\alpha$-relative entropies are given by the
functions $\varphi(x) = -\log x$, $\varphi(x) = (1 - \sqrt{x})^2$ and $\varphi(x) = \frac{4}{1 - \alpha^2} \big( 1 - x^{\frac{1+\alpha}{2}} \big)$. We note that the
$\alpha$-relative entropy is strongly related to the Rényi [31] and to the Tsallis entropy [35], [36]. The
quadratic form induced by the $\varphi$-divergence is

\[ g^{(\varphi,p)}_{D,u}((X,x),(Y,y)) = \frac{\partial^2}{\partial s \partial t}\, H_\varphi\big(f_p(D,u,\cdot), f_p(D + tX + sY, u + tx + sy, \cdot)\big) \Big|_{t=s=0}. \]



Theorem 3.5. Assume that $\varphi : \mathbb{R}^+ \to \mathbb{R}$ is a strictly convex function and $\varphi(1) = 0$. Then
$g^{(\varphi,p)} = \varphi''(1)\, g^{(F,p)}$ on the manifold $\Xi_n$.



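Proof. The computation

\begin{align*}
g^{(\varphi,p)}_{D,u}((X,x),(Y,y))
&= \frac{\partial^2}{\partial s \partial t}\, H_\varphi\big(f_p(D,u,\cdot), f_p(D + tX + sY, u + tx + sy, \cdot)\big) \Big|_{t=s=0} \\
&= \int \frac{\partial^2}{\partial s \partial t}\, \varphi\Big( \frac{f_p(D + tX + sY, u + tx + sy, z)}{f_p(D,u,z)} \Big) \Big|_{t=s=0} f_p(D,u,z) \, dz \\
&= \varphi'(1) \int \frac{\partial^2 f_p(D + tX + sY, u + tx + sy, z)}{\partial s \partial t} \Big|_{t=s=0} dz \\
&\quad + \varphi''(1) \int \frac{1}{f_p(D,u,z)}\, \frac{\partial f_p(D + tX, u + tx, z)}{\partial t} \Big|_{t=0}\, \frac{\partial f_p(D + sY, u + sy, z)}{\partial s} \Big|_{s=0} dz \\
&= \varphi''(1) \int \frac{1}{f_p(D,u,z)}\, \frac{\partial f_p(D,u,z)}{\partial (X,x)}\, \frac{\partial f_p(D,u,z)}{\partial (Y,y)} \, dz \\
&= \varphi''(1)\, g^{(F,p)}_{D,u}((X,x),(Y,y))
\end{align*}

verifies the Theorem. The term with $\varphi'(1)$ vanishes because the integral of $f_p$ is identically 1.

□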
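The relation $g^{(\varphi,p)} = \varphi''(1)\, g^{(F,p)}$ can be illustrated in the simplest case n = 1, p = 1, where both the Kullback-Leibler divergence ($\varphi(x) = -\log x$, $\varphi''(1) = 1$) and the Hellinger divergence ($\varphi(x) = (1-\sqrt{x})^2$, $\varphi''(1) = \frac12$) between normal laws have well known closed forms. The sketch below differentiates these closed forms numerically in the second argument; the function names and the step size are illustrative assumptions.

```python
import math


def kl(d1, u1, d2, u2):
    """Kullback-Leibler divergence of N(u1, 1/d1) from N(u2, 1/d2)."""
    return 0.5 * (math.log(d1 / d2) + d2 / d1 + d2 * (u1 - u2) ** 2 - 1.0)


def hellinger(d1, u1, d2, u2):
    """Integral of (sqrt(f1) - sqrt(f2))^2, i.e. 2 * (1 - Bhattacharyya coefficient)."""
    s1, s2 = 1.0 / d1, 1.0 / d2  # variances
    bc = math.sqrt(2.0 * math.sqrt(s1 * s2) / (s1 + s2)) * math.exp(-(u1 - u2) ** 2 / (4.0 * (s1 + s2)))
    return 2.0 * (1.0 - bc)


def induced_form(divergence, d, u, X, x, h=1e-3):
    """Second derivative of t -> divergence(theta, theta + t * (X, x)) at t = 0."""
    plus = divergence(d, u, d + h * X, u + h * x)
    minus = divergence(d, u, d - h * X, u - h * x)
    centre = divergence(d, u, d, u)
    return (plus - 2.0 * centre + minus) / (h * h)


def fisher_normal(d, X, x):
    """Fisher quadratic form of the normal family in the parameters (d, u)."""
    return 0.5 * X * X / (d * d) + x * x * d
```

The two induced forms should equal $1 \cdot g^{(F,1)}$ and $\frac12 \cdot g^{(F,1)}$ respectively, up to the finite-difference error.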



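Calvo and Oller studied a different metric on the space $\Xi_n$ [7]. Their starting point was the metric

\[ g : P(n) \times M_n \times M_n \to \mathbb{R}, \qquad (D, X, Y) \mapsto \frac12 \operatorname{Tr}(D^{-1}XD^{-1}Y), \]

where $P(n)$ denotes the set of $n \times n$ real, symmetric, positive definite matrices, and the embedding

\[ \pi : \mathcal{M}_n \times \mathbb{R}^n \times \mathbb{R}^+ \to P(n+1), \qquad (K, u, \beta) \mapsto \begin{pmatrix} K + \beta\, u \circ u & \beta u \\ \beta u^T & \beta \end{pmatrix}. \]

The metric $g$ has been studied by Siegel [33], James and Burbea [5]. Calvo and Oller considered
the pull-back metric of $g$ by $\pi$ restricted to the manifold $\mathcal{M}_n \times \mathbb{R}^n \times \{\beta\}$, which is

\[ \pi^* g \big|_{\mathcal{M}_n \times \mathbb{R}^n \times \{\beta\}} : (\mathcal{M}_n \times \mathbb{R}^n) \times T_n \times T_n \to \mathbb{R}, \qquad ((K,u),(X,x),(Y,y)) \mapsto \frac12 \operatorname{Tr}(K^{-1}XK^{-1}Y) + \beta \langle x, K^{-1}y \rangle. \]

If we use our parametrization of the normal distributions, namely the inverse of the covariance matrix
and the expectation vector, then the metric is

\[ g^{(CO,\beta)} : \Xi_n \times T_n \times T_n \to \mathbb{R}, \qquad ((D,u),(X,x),(Y,y)) \mapsto \frac12 \operatorname{Tr}(D^{-1}XD^{-1}Y) + \beta \langle x, Dy \rangle. \]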

Lovrić, Min-Oo and Ruh [32] studied a slightly different metric on the space $\Xi_n$. Let us sketch
their fundamental idea briefly. Denote by $P_1(n)$ the set of $n \times n$ real, symmetric, positive definite
matrices with determinant 1. Then the map

\[ j : \mathcal{M}_n \times \mathbb{R}^n \to P_1(n+1), \qquad (K,u) \mapsto (\det K)^{-\frac{1}{n+1}} \begin{pmatrix} K + u \circ u & u \\ u^T & 1 \end{pmatrix} \]

is a smooth bijection. The special linear group has a natural smooth group action on $P_1(n)$

\[ \lambda : SL(n) \to \mathrm{Aut}(P_1(n)), \qquad g \mapsto (m \mapsto g m g^T), \]

where $g^T$ denotes the transpose of $g$. This group action represents $P_1(n)$ as the Riemannian symmetric
space $SL(n)/SO(n)$ with $SO(n)$ principal bundle

\[ SL(n) \to P_1(n), \qquad g \mapsto g g^T. \]

This means that the space $\mathcal{M}_n \times \mathbb{R}^n$ can be considered as a Riemannian symmetric space $SL(n+1)/SO(n+1)$.
It is known in the theory of symmetric spaces that the natural $SL(n+1)$ invariant
metric on the space $P_1(n+1)$ is given by restricting the Killing form of the simple Lie algebra sl(n + 1)
to the subspace So(n + 1) under the Cartan decomposition sl(n + 1) = o(n + 1) ⊕ So(n + 1). The
generated metric is unique up to a positive constant factor. This metric at a point $(K,u) \in \mathcal{M}_n \times \mathbb{R}^n$
for tangent vectors $(X,x), (Y,y) \in T_n$ is given by the equation

\[ g^{(LMR)}_{K,u}((X,x),(Y,y)) = \operatorname{Tr}(K^{-1}XK^{-1}Y) - \frac{1}{n+1} \operatorname{Tr}(K^{-1}X) \operatorname{Tr}(K^{-1}Y) + \frac12 \langle x, K^{-1}y \rangle. \]

Using the inverse of the covariance matrix as a parameter, this metric is

\[ g^{(LMR)}_{D,u}((X,x),(Y,y)) = \operatorname{Tr}(D^{-1}XD^{-1}Y) - \frac{1}{n+1} \operatorname{Tr}(D^{-1}X) \operatorname{Tr}(D^{-1}Y) + \frac12 \langle x, Dy \rangle. \]

Corollary 3.1. On the parameter space of the special normal distributions we have the equality of 
the metrics 

On the parameter space of the normal distributions we have 

\[ g^{(F,1)} = g^{(CO,1)}, \]

but the metrics $g^{(F,p)}$, $g^{(CO,\beta)}$ and $g^{(LMR)}$ are pairwise incomparable in the sense that there is no
parameter $(n,p) \in \mathcal{N}$ such that two of these metrics are equal up to a multiplicative factor.

To work with the Riemannian metrics $g^{(R)}$, $g^{(T,p,q)}$, $g^{(F,q)}$, $g^{(\varphi,q)}$, $g^{(CO,\beta)}$ and $g^{(LMR)}$
simultaneously we consider the metric

\[ g_{D,u}((X,x),(Y,y)) = \frac12 \operatorname{Tr}(D^{-1}XD^{-1}Y) + \alpha \operatorname{Tr}(D^{-1}X) \operatorname{Tr}(D^{-1}Y) + \beta \langle x, Dy \rangle \]

with parameters $\alpha, \beta \in \mathbb{R}$, $\alpha \ne -\frac{1}{2n}$, $\beta \ge 0$ on the manifold $\Xi_n$. In the $\beta = 0$ case we restrict the
manifold to $\Xi_n^{(s)}$. For every $D \in \Xi_n^{(s)}$ and $X, Y \in M_n$ we have a Cauchy-Schwarz inequality

\[ \operatorname{Tr}^2(D^{-1}XD^{-1}Y) \le \operatorname{Tr}(D^{-1}XD^{-1}X) \operatorname{Tr}(D^{-1}YD^{-1}Y). \]

Substituting $Y = D$ we have

\[ \frac12 \operatorname{Tr}(D^{-1}XD^{-1}X) - \frac{1}{2n} (\operatorname{Tr} D^{-1}X)^2 \ge 0. \]

It means that if $\alpha > -\frac{1}{2n}$ then $g$ is a Riemannian metric, if $\alpha < -\frac{1}{2n}$ then $g$ is a semi-Riemannian
metric, and in the $\alpha = -\frac{1}{2n}$ case $g$ is a degenerate quadratic form. The Theorems and proofs are
valid for semi-Riemannian metrics too, so we have just one condition $\alpha \ne -\frac{1}{2n}$.

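4 Geodesics

In this section we derive the differential equation of the geodesic lines in the space $\Xi_n$ and we present
some solutions.

Theorem 4.1. A curve $\gamma : \mathbb{R} \to \Xi_n$, $\gamma(t) = (D(t), u(t))$ is a geodesic curve if and only if for every
$t \in \mathrm{Dom}\,\gamma$

\begin{align*}
\ddot D(t) &= \dot D(t) D(t)^{-1} \dot D(t) + \beta\, (D(t)\dot u(t)) \circ (D(t)\dot u(t)) - \frac{2\alpha\beta}{1 + 2n\alpha} \langle \dot u(t), D(t)\dot u(t) \rangle D(t) \tag{18} \\
\ddot u(t) &= -D(t)^{-1} \dot D(t) \dot u(t)
\end{align*}

holds.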

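Proof. Denote by $GL(n)$ the set of invertible $n \times n$ matrices and define the reciprocal function as

\[ i : GL(n) \to GL(n), \qquad D \mapsto D^{-1}. \]

At the point $D$ the tangent space $T_D GL(n)$ can be identified with the set of $n \times n$ matrices $\mathrm{Mat}(n)$.
The derivative of the inversion function is

\[ di : GL(n) \to \mathrm{Lin}(\mathrm{Mat}(n), \mathrm{Mat}(n)), \qquad D \mapsto \big( A \mapsto (di)(D)(A) = -D^{-1}AD^{-1} \big). \]

This leads to the derivative of the metric

\[ dg : \Xi_n \to \mathrm{Lin}(T_n, \mathrm{Lin}(T_n \times T_n, \mathbb{R})), \qquad (D,u) \mapsto \Big( (Z,z) \mapsto \big( ((X,x),(Y,y)) \mapsto dg_{D,u}(Z,z)((X,x),(Y,y)) \big) \Big), \]

where

\begin{align*}
dg_{D,u}(Z,z)((X,x),(Y,y)) = &-\frac12 \operatorname{Tr}(D^{-1}ZD^{-1}XD^{-1}Y) - \frac12 \operatorname{Tr}(D^{-1}XD^{-1}ZD^{-1}Y) + \beta \langle x, Zy \rangle \\
&- \alpha \operatorname{Tr}(D^{-1}ZD^{-1}X) \operatorname{Tr}(D^{-1}Y) - \alpha \operatorname{Tr}(D^{-1}X) \operatorname{Tr}(D^{-1}ZD^{-1}Y).
\end{align*}

At a given point $(D,u) \in \Xi_n$ for given tangent vectors $(X,x), (Y,y) \in T_n$ the map

\[ \tau_{(D,u),(X,x),(Y,y)} : T_n \to \mathbb{R}, \qquad (Z,z) \mapsto \frac12 \Big( dg_{D,u}(Y,y)((X,x),(Z,z)) + dg_{D,u}(X,x)((Y,y),(Z,z)) - dg_{D,u}(Z,z)((X,x),(Y,y)) \Big) \]

is a linear functional. It means that there exists a unique tangent vector $V_{(D,u),(X,x),(Y,y)} \in T_n$ such
that for all vectors $(Z,z) \in T_n$

\[ g_{D,u}\big( V_{(D,u),(X,x),(Y,y)}, (Z,z) \big) = \tau_{(D,u),(X,x),(Y,y)}(Z,z) \]

holds. One can define the map

\[ \Gamma : \Xi_n \to \mathrm{Lin}(T_n \times T_n, T_n), \qquad (D,u) \mapsto \big( ((X,x),(Y,y)) \mapsto V_{(D,u),(X,x),(Y,y)} \big), \]

which is called the covariant derivative. It means that the equation, for all tangent vectors $(Z,z) \in T_n$,

\begin{align*}
g_{D,u}\big( \Gamma_{(D,u)}(X,x)(Y,y), (Z,z) \big) = &-\frac14 \operatorname{Tr}\big( D^{-1}(XD^{-1}Y + YD^{-1}X)D^{-1}Z \big) \tag{19} \\
&- \frac{\alpha}{2} \operatorname{Tr}\big( D^{-1}(XD^{-1}Y + YD^{-1}X) \big) \operatorname{Tr}(D^{-1}Z) \\
&+ \frac{\beta}{2} \big( \langle y, Xz \rangle + \langle x, Yz \rangle - \langle x, Zy \rangle \big)
\end{align*}

determines the covariant derivative. Let us write the covariant derivative in the form of

\[ \Gamma_{(D,u)}(X,x)(Y,y) = \Big( -\frac12 (XD^{-1}Y + YD^{-1}X) + DW,\ D^{-1}w \Big) \]

for some $n \times n$ matrix $W$ and vector $w \in \mathbb{R}^n$. Then Equation (19) is

\[ \frac12 \operatorname{Tr}(WD^{-1}Z) + \alpha \operatorname{Tr} W \operatorname{Tr} D^{-1}Z + \beta \langle w, z \rangle = \frac{\beta}{2} \big( \langle y, Xz \rangle + \langle x, Yz \rangle - \langle x, Zy \rangle \big). \tag{20} \]

Let us introduce the notation for the symmetrized dyadic product, for vectors $u, v \in \mathbb{R}^n$

\[ u \odot v = u \circ v + v \circ u, \]

that is, the components of the $n \times n$ matrix $(u \odot v)$ are $(u \odot v)_{ij} = u_i v_j + u_j v_i$. Equation (20) means
that the vector component of the covariant derivative is

\[ w = \frac12 (Xy + Yx) \]

and the remaining matrix part is

\[ W = -\frac{\beta}{2} (x \odot y) D + \gamma E, \]

where the parameter $\gamma = \frac{2\alpha\beta}{1 + 2n\alpha} \langle x, Dy \rangle$ can easily be found. Combining the terms together we have the following
expression for the covariant derivative.

\[ \Gamma_{(D,u)}(X,x)(Y,y) = \Big( -\frac12 (XD^{-1}Y + YD^{-1}X) - \frac{\beta}{2} D(x \odot y)D + \frac{2\alpha\beta}{1 + 2n\alpha} \langle x, Dy \rangle D,\ \frac12 D^{-1}(Xy + Yx) \Big) \tag{21} \]

A curve $\gamma : \mathbb{R} \to \Xi_n$ is called a geodesic curve if

\[ \forall t \in \mathrm{Dom}(\gamma) : \quad \ddot\gamma(t) + \Gamma_{\gamma(t)}(\dot\gamma(t))(\dot\gamma(t)) = 0 \]

holds. Consider a curve $\gamma : \mathbb{R} \to \Xi_n$, $\gamma(t) = (D(t), u(t))$, substitute it into the equation of the geodesic
curve and expand the covariant derivative, then according to Equation (21) we get Equation (18) of
the Theorem. □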
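We have some remarks about the geodesic curves, which are only valid for Riemannian metrics.

Remark 4.1. Let us consider the case n = 1, and assume that $\alpha > -\frac12$. Then the system of differential
equations of the geodesic line is

\[ \ddot D(t) = \frac{\dot D(t)^2}{D(t)} + \frac{\beta}{1 + 2\alpha} D(t)^2 \dot u(t)^2, \qquad \ddot u(t) = -\frac{\dot D(t)}{D(t)}\, \dot u(t). \]

The curve $\gamma : \mathbb{R} \to \Xi_1$

\[ \gamma(t) = \Big( \frac{2}{a^2} \cosh^2(bt + c),\ a \sqrt{\frac{1 + 2\alpha}{\beta}} \tanh(bt + c) + d \Big) \]

is a geodesic line.

Assume that we have two points $(D_0,u_0)$, $(D_1,u_1)$ in the space $\Xi_1$ and assume that $u_1 > u_0$. Let
us define the following quantities

\[ x = (u_1 - u_0) \sqrt{\frac{\beta D_0}{2 + 4\alpha}}, \qquad y = \sqrt{\frac{D_1}{D_0}}, \qquad c = \log \frac{\sqrt{(x^2y^2 + 1 - y^2)^2 + 4x^2y^4} - (x^2y^2 + 1 - y^2)}{2xy^2}, \]

\[ a = \sqrt{\frac{2}{D_0}} \cosh c, \qquad b = -\log\big( y(1 - xe^c) \big), \qquad d = u_0 - a \sqrt{\frac{1 + 2\alpha}{\beta}} \tanh c. \]

Then the curve $\gamma : [0,1] \to \Xi_1$

\[ \gamma(t) = \Big( \frac{2}{a^2} \cosh^2(bt + c),\ a \sqrt{\frac{1 + 2\alpha}{\beta}} \tanh(bt + c) + d \Big) \]

is a geodesic line, such that $\gamma(0) = (D_0,u_0)$ and $\gamma(1) = (D_1,u_1)$. Simple calculation shows that the
distance between the points $(D_0,u_0), (D_1,u_1) \in \Xi_1$ is

\[ d\big( (D_0,u_0), (D_1,u_1) \big) = \int_0^1 \sqrt{g_{\gamma(t)}(\dot\gamma(t), \dot\gamma(t))} \, dt = \sqrt{2 + 4\alpha}\, |b|. \]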
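The closed-form curve of Remark 4.1 can be tested numerically. The sketch below assumes the n = 1 geodesic equations $\ddot D = \dot D^2/D + \frac{\beta}{1+2\alpha} D^2 \dot u^2$ and $\ddot u = -\dot D \dot u / D$ (the n = 1 specialization of Equation (18)) and the metric $(\frac12 + \alpha)\dot D^2/D^2 + \beta D \dot u^2$; it evaluates finite-difference residuals of both equations and the speed along the curve $((2/a^2)\cosh^2(bt+c),\ a\sqrt{(1+2\alpha)/\beta}\tanh(bt+c) + d)$. The function names and the step size are illustrative.

```python
import math


def geodesic(t, a, b, c, d, alpha, beta):
    """Closed-form curve, returned as the pair (D(t), u(t))."""
    big_d = 2.0 / a ** 2 * math.cosh(b * t + c) ** 2
    u = a * math.sqrt((1.0 + 2.0 * alpha) / beta) * math.tanh(b * t + c) + d
    return big_d, u


def residuals(t, a, b, c, d, alpha, beta, h=1e-4):
    """Residuals of the two geodesic equations and the metric speed at time t."""
    d_m, u_m = geodesic(t - h, a, b, c, d, alpha, beta)
    d_0, u_0 = geodesic(t, a, b, c, d, alpha, beta)
    d_p, u_p = geodesic(t + h, a, b, c, d, alpha, beta)
    d_dot, u_dot = (d_p - d_m) / (2.0 * h), (u_p - u_m) / (2.0 * h)
    d_ddot = (d_p - 2.0 * d_0 + d_m) / (h * h)
    u_ddot = (u_p - 2.0 * u_0 + u_m) / (h * h)
    r_matrix = d_ddot - d_dot ** 2 / d_0 - beta / (1.0 + 2.0 * alpha) * d_0 ** 2 * u_dot ** 2
    r_vector = u_ddot + d_dot * u_dot / d_0
    speed = math.sqrt((0.5 + alpha) * d_dot ** 2 / d_0 ** 2 + beta * d_0 * u_dot ** 2)
    return r_matrix, r_vector, speed
```

Both residuals should vanish up to discretization error, and the speed should be the constant $\sqrt{2 + 4\alpha}\,|b|$, which is consistent with the distance formula of the Remark.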

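The geodesic line and the Rao distance on the space of special normal distributions have been
computed by Siegel [33] and Burbea. The next Remark concerns their results.

Remark 4.2. Let us consider the space of n dimensional special normal distributions $\Xi_n^{(s)}$ and assume
that $\alpha > -\frac{1}{2n}$. Then the curve $D : \mathbb{R} \to \Xi_n^{(s)}$ is a geodesic line if and only if

\[ \ddot D(t) = \dot D(t) D(t)^{-1} \dot D(t). \]

Assume that we have two points $D_0, D_1$ in the space $\Xi_n^{(s)}$. Then the curve

\[ \gamma : [0,1] \to \Xi_n^{(s)}, \qquad t \mapsto D_0^{\frac12} \exp\Big( t \log\big( D_0^{-\frac12} D_1 D_0^{-\frac12} \big) \Big) D_0^{\frac12} \tag{22} \]

is a geodesic line, such that $\gamma(0) = D_0$ and $\gamma(1) = D_1$. The distance between the points $D_0, D_1 \in \Xi_n^{(s)}$ is

\[ d(D_0, D_1) = \int_0^1 \sqrt{g_{\gamma(t)}(\dot\gamma(t), \dot\gamma(t))} \, dt = \sqrt{ \frac12 \operatorname{Tr} \log^2\big( D_0^{-\frac12} D_1 D_0^{-\frac12} \big) + \alpha \operatorname{Tr}^2 \log\big( D_0^{-\frac12} D_1 D_0^{-\frac12} \big) }. \]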
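Remark 4.3. In the $\alpha = 0$ case the metric is the pull-back of the Siegel metric by the embedding

\[ \pi_\beta : \Xi_n \to P(n+1), \qquad (D,u) \mapsto \begin{pmatrix} D^{-1} + \beta\, u \circ u & \beta u \\ \beta u^T & \beta \end{pmatrix}. \]

The equation of the geodesic line in the Siegel metric is given by Equation (22). If we have two points
$(D_0,u_0), (D_1,u_1) \in \Xi_n$ then we define the matrices $S_i = \pi_\beta(D_i,u_i)$ ($i = 0,1$), the equation of the
geodesic line is

\[ \gamma : [0,1] \to \Xi_n, \qquad t \mapsto \pi_\beta^{-1}\Big( S_0^{\frac12} \exp\big( t \log( S_0^{-\frac12} S_1 S_0^{-\frac12} ) \big) S_0^{\frac12} \Big) \]

and the distance between the points is

\[ d\big( (D_0,u_0), (D_1,u_1) \big) = \sqrt{ \frac12 \operatorname{Tr} \log^2\big( S_0^{-\frac12} S_1 S_0^{-\frac12} \big) }. \]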

Remark 4.4. In the $\alpha = 0$ case we can give a more exact parametrization of the geodesic line in some
special cases. Assume that $B, C \in M_n$ are diagonal matrices, $A, \underline{D} \in \mathbb{R}^n$ are vectors such that the
components of $A$ are equal and $U$ is an $n \times n$ orthogonal matrix such that $UA = A$. Then the curve
2 

iiii 2 v 'V/3' 



t^> ( cosh 2 (Bt + C)!/- 1 , J -U t&nh(Bt + C)A + D 



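is a geodesic line. The distance between the points $\gamma(t_0)$ and $\gamma(t_1)$ ($t_0, t_1 \in \mathbb{R}^+$) is

\[ d(\gamma(t_0), \gamma(t_1)) = \int_{t_0}^{t_1} \sqrt{g_{\gamma(t)}(\dot\gamma(t), \dot\gamma(t))} \, dt = |t_1 - t_0| \sqrt{2 \operatorname{Tr} B^2}. \]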

5 Curvatures 

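Since Efron clarified the statistical meaning of the curvature, different curvature tensors have been
studied on statistical manifolds. For curvatures on the space of normal distributions see for example
Amari [2], [3], Siegel [33], Burbea, Skovgaard [34], Calvo and Oller [7], Lovrić, Min-Oo and Ruh.

Theorem 5.1. For every point $(D,u) \in \Xi_n$ and for every tangent vectors $(X,x), (Y,y), (Z,z) \in T_n$
the Riemann curvature tensor is

\begin{align*}
R_{(D,u)}((X,x),(Y,y),(Z,z)) = \Big(
&\frac14 \big( ZD^{-1}XD^{-1}Y + YD^{-1}XD^{-1}Z - XD^{-1}YD^{-1}Z - ZD^{-1}YD^{-1}X \big) \\
&+ \frac{\beta}{4} \big( Yx \odot Dz - Xy \odot Dz + Dy \odot Zx - Dx \odot Zy \big) \\
&+ \frac{\alpha\beta}{1 + 2n\alpha} \big( \langle Xy - Yx, z \rangle + \langle x, Zy \rangle - \langle y, Zx \rangle \big) D, \\
&\frac14 \big( D^{-1}(YD^{-1}X - XD^{-1}Y)z + D^{-1}ZD^{-1}(Xy - Yx) \big) + \frac{\beta}{4} \big( (z \odot x)Dy - (z \odot y)Dx \big) \\
&+ \frac{\alpha\beta}{1 + 2n\alpha} \big( \langle y, Dz \rangle x - \langle x, Dz \rangle y \big) \Big). \tag{23}
\end{align*}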

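Proof. The derivative of the covariant derivative is

\[
d\Gamma : \Xi_n \to \operatorname{Lin}(T_n, \operatorname{Lin}(T_n \times T_n, T_n)), \qquad
(D,u) \mapsto \Big( (X,x) \mapsto \big( ((Y,y),(Z,z)) \mapsto d\Gamma_{(D,u)}(X,x)(Y,y)(Z,z) \big) \Big),
\]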



(23) 
where from Equation (21)

dr (Aa) (z, y) ^(xd^zd^y + yd^zd^x) - 1 (z(z y)D + £>(z y)z) 

+ rT^ G-' Z ^ )D + ^ D ^ Z ) ~ \D- x ZD- x {Xy + Yx). 
The Riemann curvature tensor is defined to be

\[
R : \Xi_n \to \operatorname{Lin}(T_n \times T_n \times T_n, T_n), \qquad
(D,u) \mapsto \Big( ((X,x),(Y,y),(Z,z)) \mapsto R_{(D,u)}((X,x),(Y,y),(Z,z)) \Big),
\]

where

\[
\begin{aligned}
R_{(D,u)}((X,x),(Y,y),(Z,z)) ={}& d\Gamma_{(D,u)}(X,x)(Y,y)(Z,z) - d\Gamma_{(D,u)}(Y,y)(X,x)(Z,z) \\
&+ \Gamma_{(D,u)}\big((X,x), \Gamma_{(D,u)}((Y,y),(Z,z))\big) - \Gamma_{(D,u)}\big((Y,y), \Gamma_{(D,u)}((X,x),(Z,z))\big).
\end{aligned}
\]

We omit the details of the straightforward but lengthy calculation of the curvature tensor. □

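Theorem 5.2. For every point $(D,u) \in \Xi_n$ and for all tangent vectors $(X,x), (Y,y) \in T_n$ the
Ricci curvature tensor is

\[
\operatorname{Ric}_{(D,u)}((X,x),(Y,y)) = -\frac{n+1}{4} \operatorname{Tr}(D^{-1}XD^{-1}Y) + \frac{1}{4} \operatorname{Tr}(D^{-1}X) \operatorname{Tr}(D^{-1}Y) - \frac{\beta}{2(1+2n\alpha)} \langle x, Dy \rangle. \tag{24}
\]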

Proof. At a point $(D,u) \in \Xi_n$ for given tangent vectors $(X,x), (Y,y) \in T_n$ the map

\[
R_{(D,u)}(\cdot, (X,x), (Y,y)) : T_n \to T_n, \qquad (Z,z) \mapsto R_{(D,u)}((Z,z), (X,x), (Y,y))
\]

is linear, and its trace is the Ricci tensor

\[
\operatorname{Ric} : \Xi_n \to \operatorname{Lin}(T_n \times T_n, \mathbb{R}), \qquad
(D,u) \mapsto \Big( ((X,x),(Y,y)) \mapsto \operatorname{Ric}_{(D,u)}((X,x),(Y,y)) \Big),
\]

where

\[
\operatorname{Ric}_{(D,u)}((X,x),(Y,y)) = \operatorname{Tr} R_{(D,u)}(\cdot, (X,x), (Y,y)).
\]

The elements $\operatorname{Ric}_{(D,u)}((X,x),(X,x))$ determine the Ricci tensor. For the further calculation we
fix the tangent vector $(X,x) \in T_n$. According to Equation (23) the Riemann curvature tensor
consists of six summands. We compute the trace of the summands separately. Let us denote by $E_{ij}$
the usual system of $n \times n$ matrix units and define

\[
F_{ij} = E_{ij} + E_{ji}
\]

for indices $1 \le i < j \le n$. To compute the trace we choose the basis

\[
\{E_{ii}\}_{i=1,\dots,n} \cup \{F_{ij}\}_{1 \le i < j \le n} \cup \{e_i\}_{i=1,\dots,n} \tag{25}
\]

in $T_n$, where $(e_i)_{i=1,\dots,n}$ is the canonical basis in $\mathbb{R}^n$. The trace of the first summand is
1 " 

Pi =- Tr {XD- l E u D- l XE u - EuD^XD^XEu) 

i=l 

+ \ T±(XD- 1 F ij D- 1 XF ij -F ij D- 1 XD- 1 XF ij ) 

l<z<j<n 
which is

Pl = -^±1ti-d- 1 xd- 1 x + -TtX 2 d- 2 + -Tt 2 d- 1 x. 

H 4 4 4 

The trace of the second summand is 

92 = -^Tr(B tt (£ I4 a;0fe)) - | ^ Tri^-(x O Dxji^ - -/jILtl^Dx). 

2=1 1 <n 

The third summand gives 

i— 1 l<i<j<n 

The traces of the fourth, fifth and sixth summands are

n 

Pi = - ^2(e ll ~D- 1 XD- 1 Xe,) = — TrD^XD^X 

i=l 

(3 -J 1 ■> n — 1 

P5 = — };{ej, (xQe l )Dx - (xQx)Dei) = (3 — — (x, Dx) 

i=l 

Pa = -r~r: — y2( e ^ Dx)e t - (e t , Dx)x) = "f^" - (x, Dx). 
1 + 2na 1 + 2na 

i—l 

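Adding the traces we have the diagonal element of the Ricci tensor

\[
\operatorname{Ric}_{(D,u)}((X,x),(X,x)) = \sum_{i=1}^{6} \rho_i = -\frac{n+1}{4} \operatorname{Tr}(D^{-1}XD^{-1}X) + \frac{1}{4} \operatorname{Tr}^2(D^{-1}X) - \frac{\beta}{2(1+2n\alpha)} \langle x, Dx \rangle.
\]

Using the polarization formula

\[
\operatorname{Ric}_{(D,u)}((X,x),(Y,y)) = \frac{1}{4} \Big( \operatorname{Ric}_{(D,u)}((X+Y,x+y),(X+Y,x+y)) - \operatorname{Ric}_{(D,u)}((X-Y,x-y),(X-Y,x-y)) \Big)
\]

we get Equation (24). □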
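The polarization step can be checked numerically. The sketch below assumes that the diagonal element of the Ricci tensor reads $-\frac{n+1}{4}\operatorname{Tr}(D^{-1}XD^{-1}X) + \frac{1}{4}\operatorname{Tr}^2(D^{-1}X) - \frac{\beta}{2(1+2n\alpha)}\langle x, Dx\rangle$, and compares the polarized value with the bilinear expression obtained by replacing one copy of $(X,x)$ with $(Y,y)$ in each term. All function names are illustrative.

```python
import numpy as np


def ricci_diag(D, X, x, alpha, beta):
    """Assumed diagonal element Ric((X,x),(X,x)) at the point (D,u)."""
    n = D.shape[0]
    A = np.linalg.solve(D, X)  # D^{-1} X
    return (-(n + 1) / 4 * np.trace(A @ A)
            + 0.25 * np.trace(A) ** 2
            - beta / (2 * (1 + 2 * n * alpha)) * (x @ D @ x))


def ricci_polarized(D, X, x, Y, y, alpha, beta):
    """Ric((X,x),(Y,y)) recovered from diagonal elements by polarization."""
    return 0.25 * (ricci_diag(D, X + Y, x + y, alpha, beta)
                   - ricci_diag(D, X - Y, x - y, alpha, beta))


def ricci_direct(D, X, x, Y, y, alpha, beta):
    """Bilinear form of the same expression, written out term by term."""
    n = D.shape[0]
    A = np.linalg.solve(D, X)  # D^{-1} X
    B = np.linalg.solve(D, Y)  # D^{-1} Y
    return (-(n + 1) / 4 * np.trace(A @ B)
            + 0.25 * np.trace(A) * np.trace(B)
            - beta / (2 * (1 + 2 * n * alpha)) * (x @ D @ y))
```

The two functions agree because every term of the quadratic form comes from a symmetric bilinear form, which is exactly the situation in which polarization applies.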
The next theorem shows that the manifolds $\Xi_n^s$ and $\Xi_n$ have constant scalar curvature.

Theorem 5.3. For every point $D \in \Xi_n^s$ the scalar curvature of the space of special normal
distributions is

\[
\operatorname{Scal}_s(D) = -\frac{n\left(2(n-1)(n+1)(n+2)\alpha + n^2 + 2n - 1\right)}{4(1+2n\alpha)} \tag{26}
\]

and for every point $(D,u) \in \Xi_n$ the scalar curvature of the space of normal distributions is

\[
\operatorname{Scal}(D,u) = -\frac{n(n+1)\left(2(n+2)(n-1)\alpha + n + 1\right)}{4(1+2n\alpha)}. \tag{27}
\]
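The two closed forms can be evaluated exactly with rational arithmetic. The sketch below assumes that they read $\operatorname{Scal}_s(D) = -\frac{n(2(n-1)(n+1)(n+2)\alpha + n^2+2n-1)}{4(1+2n\alpha)}$ and $\operatorname{Scal}(D,u) = -\frac{n(n+1)(2(n+2)(n-1)\alpha + n+1)}{4(1+2n\alpha)}$; the function names are illustrative.

```python
from fractions import Fraction


def scal_special(n, alpha):
    """Assumed scalar curvature (26) of the space of special normal distributions."""
    alpha = Fraction(alpha)
    numerator = n * (2 * (n - 1) * (n + 1) * (n + 2) * alpha + n * n + 2 * n - 1)
    return -numerator / (4 * (1 + 2 * n * alpha))


def scal_normal(n, alpha):
    """Assumed scalar curvature (27) of the space of normal distributions."""
    alpha = Fraction(alpha)
    numerator = n * (n + 1) * (2 * (n + 2) * (n - 1) * alpha + n + 1)
    return -numerator / (4 * (1 + 2 * n * alpha))
```

For $n = 1$ and $\alpha = 0$ the second function returns $-1$, and for $n = 2$, $\alpha = 0$ it returns $-9/2$. Under these assumed forms the difference of the two curvatures is $-\frac{n}{2(1+2n\alpha)}$ for every $n$ and $\alpha$.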

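Proof. At a point $(D,u) \in \Xi_n$ for a given tangent vector $(X,x) \in T_n$ the map

\[
T_n \to \mathbb{R}, \qquad (Y,y) \mapsto \operatorname{Ric}_{(D,u)}((X,x),(Y,y))
\]

defines a linear functional. So there exists a unique tangent vector $(\tilde X, \tilde x) \in T_n$ such that

\[
g_{(D,u)}((\tilde X, \tilde x), (Y,y)) = \operatorname{Ric}_{(D,u)}((X,x),(Y,y))
\]

holds for every tangent vector $(Y,y) \in T_n$. Let us define the map

\[
\operatorname{Ric} : \Xi_n \to \operatorname{Lin}(T_n, T_n), \qquad (D,u) \mapsto \big( (X,x) \mapsto (\tilde X, \tilde x) \big).
\]

The explicit expression

\[
\operatorname{Ric}_{(D,u)}(X,x) = \left( -\frac{n+1}{2} X + \frac{1+2(n+1)\alpha}{2(1+2n\alpha)} \operatorname{Tr}(D^{-1}X)\, D,\; -\frac{1}{2(1+2n\alpha)}\, x \right) \tag{28}
\]

can be easily verified. The scalar curvature of the manifold is the trace of the map $\operatorname{Ric}$:

\[
\operatorname{Scal} : \Xi_n \to \mathbb{R}, \qquad (D,u) \mapsto \operatorname{Tr} \operatorname{Ric}_{(D,u)}.
\]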

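Using the basis (25) the traces of the three summands in Equation (28) are

\[
\rho'_1 = -\frac{n+1}{2} \left( \sum_{i=1}^{n} \operatorname{Tr}(E_{ii}E_{ii}) + \frac{1}{2} \sum_{1 \le i < j \le n} \operatorname{Tr}(F_{ij}F_{ij}) \right) = -\frac{n(n+1)^2}{4},
\]

\[
\rho'_2 = \frac{1+2(n+1)\alpha}{2(1+2n\alpha)} \sum_{i=1}^{n} \operatorname{Tr}(D^{-1}E_{ii}) \operatorname{Tr}(DE_{ii}) + \frac{1+2(n+1)\alpha}{4(1+2n\alpha)} \sum_{1 \le i < j \le n} \operatorname{Tr}(D^{-1}F_{ij}) \operatorname{Tr}(DF_{ij}) = \frac{n\left(1+2(n+1)\alpha\right)}{2(1+2n\alpha)},
\]

\[
\rho'_3 = -\frac{1}{2(1+2n\alpha)} \sum_{i=1}^{n} \langle e_i, e_i \rangle = -\frac{n}{2(1+2n\alpha)}.
\]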

The scalar curvature of the manifold $\Xi_n$ at a point $(D,u) \in \Xi_n$ is

\[
\operatorname{Scal}(D,u) = \rho'_1 + \rho'_2 + \rho'_3
\]

and the scalar curvature of the space of special normal distributions at a point $D \in \Xi_n^s$ is

\[
\operatorname{Scal}_s(D) = \rho'_1 + \rho'_2.
\]
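The trace computation can be reproduced numerically. The following is a minimal sketch, assuming that the Ricci operator maps $(X,x)$ to $\left(-\frac{n+1}{2}X + \frac{1+2(n+1)\alpha}{2(1+2n\alpha)}\operatorname{Tr}(D^{-1}X)D,\ -\frac{1}{2(1+2n\alpha)}x\right)$ and that the trace is taken in the basis of matrix units $E_{ii}$, symmetrized units $F_{ij} = E_{ij} + E_{ji}$ and canonical vectors $e_i$. Function names are illustrative.

```python
import numpy as np


def ricci_operator(D, X, x, alpha):
    """Assumed Ricci operator: returns the pair (matrix part, vector part)."""
    n = D.shape[0]
    c = 1 + 2 * n * alpha
    tr = np.trace(np.linalg.solve(D, X))  # Tr(D^{-1} X)
    X_new = -(n + 1) / 2 * X + (1 + 2 * (n + 1) * alpha) / (2 * c) * tr * D
    x_new = -x / (2 * c)
    return X_new, x_new


def scalar_curvature_by_trace(D, alpha):
    """Trace of the Ricci operator, summed over the basis E_ii, F_ij, e_i."""
    n = D.shape[0]
    total = 0.0
    for i in range(n):
        E = np.zeros((n, n)); E[i, i] = 1.0
        X_new, _ = ricci_operator(D, E, np.zeros(n), alpha)
        total += X_new[i, i]          # coefficient of E_ii in the image
        for j in range(i + 1, n):
            F = np.zeros((n, n)); F[i, j] = F[j, i] = 1.0
            X_new, _ = ricci_operator(D, F, np.zeros(n), alpha)
            total += X_new[i, j]      # coefficient of F_ij in the image
        e = np.zeros(n); e[i] = 1.0
        _, x_new = ricci_operator(D, np.zeros((n, n)), e, alpha)
        total += x_new[i]             # coefficient of e_i in the image
    return total
```

The result does not depend on the chosen positive definite matrix $D$, which illustrates the constancy of the scalar curvature under the stated assumptions.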

6 Conclusion 

Finally we have some remarks about the geometry of the generalized Gaussian distributions. 

Remark 6.1. For every pair $(n,p) \in \mathcal{N}$, where $p < 2$, the scalar curvature of the space of extended
Gaussian distributions endowed with the Fisher information metric at every point is

Scal = - %tntp-l)) {{n + 2){n - 1)iP + 2{n + 



□ 



We note that the parameter $p$ lies in the interval $\left(\frac{n}{n+2}, 2\right)$, where the scalar curvature is a
monotonically increasing function of $p$. The scalar curvature at a given point is connected to the
statistical distinguishability of the point from its neighborhood, since the first nonconstant term in
the Taylor expansion of the volume of the geodesic ball is the scalar curvature:

\[
\operatorname{Vol}(B_r(x)) = \frac{\pi^{d/2} r^d}{\Gamma(d/2+1)} \left( 1 - \frac{\operatorname{Scal}(x)}{6(d+2)}\, r^2 + O(r^4) \right),
\]

where $d$ denotes the dimension of the manifold and $B_r(x)$ is the geodesic ball of radius $r$ centered at $x$.

This idea is widely used in quantum information geometry, and in that framework it is due to Petz
[29]. In this classical setting this means that the parameter $p$ modulates the statistical properties of
this manifold. Namely, in the $p \to 2$ limit the manifold is more homogeneous and it is more difficult to
distinguish close points, while in the $p \to \frac{n}{n+2}$ limit it is easier to decide whether two points are identical
or just close to each other. This can have relevance in hypothesis testing.
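The link between scalar curvature and the volume of small geodesic balls can be made concrete. The sketch below uses the standard second-order expansion $\operatorname{Vol}(B_r) \approx \frac{\pi^{d/2} r^d}{\Gamma(d/2+1)}\left(1 - \frac{\operatorname{Scal}}{6(d+2)} r^2\right)$ for a $d$-dimensional Riemannian manifold; the function names are illustrative, and the curvature value is supplied by the caller rather than computed from $p$.

```python
import math


def euclidean_ball_volume(dim, r):
    """Volume of a ball of radius r in flat dim-dimensional space."""
    return math.pi ** (dim / 2) * r ** dim / math.gamma(dim / 2 + 1)


def geodesic_ball_volume(scal, dim, r):
    """Second-order approximation of the volume of a small geodesic ball."""
    return euclidean_ball_volume(dim, r) * (1 - scal * r ** 2 / (6 * (dim + 2)))
```

As a sanity check, the hyperbolic plane of constant curvature $-1$ has $d = 2$ and scalar curvature $-2$, and the exact area of a geodesic disc of radius $r$ is $2\pi(\cosh r - 1)$; for small $r$ the approximation agrees with it up to terms of order $r^6$. A more negative scalar curvature gives a larger ball volume at the same radius, which is the sense in which close points become easier to distinguish.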

Remark 6.2. Consider the space of special normal distributions and the Fisher information matrix 



Surprisingly from this well-known classical metric one can easily recover some metrics which are 
frequently used in quantum information theory. In quantum setting just the trace one matrices of 
S„ are considered. For example the Riemannian metrics 



are very important in the quantum setting; they are called the Kubo-Mori [13, 28] metric and the largest
metric. This kind of differential geometrical connection can help to understand and to interpret the
geometrical invariants of the quantum information manifolds.